EXPERT SYSTEM RULE-BASE EVALUATION
USING REAL-TIME PARALLEL PROCESSING

James  L.  Noyes 


1. INTRODUCTION


The value of a rule-based expert system (ES) to help solve a
variety of diagnostic and advisory needs has been well demonstrated
over the last two decades, as discussed by Noyes in [1]. Sometimes
a large number, say O(10^4), of the ES rules must be continuously
checked in real-time (e.g., every O(10^-1) seconds) due to stringent
requirements imposed by the problem. In addition, while each rule
may use only O(10^1) criteria, there may be a very large number of
possible criteria, say O(10^5), for the entire rule-base that must
be checked during each time-step. Because of these timing demands,
parallel processing may be deemed necessary. Parallel processing
has become increasingly important in order to accelerate a variety
of computations, as discussed by Noyes in [1] and Trippi and Turban
in [2]. This report discusses research connected to the
development of a data structure and an algorithm to evaluate this
type of rule-base and the estimation of the processor speeds
necessary to evaluate these rules within the required time. The
particular application for this real-time ES is a rule-base to aid
the pilot of a modern fighter or transport aircraft, and the
remainder of this report will address this application. However,
the results of the research presented here could be used in other
applications.

All assumptions in this report on sensor update rates, number
of sensors, and other matters that determine real-time response are
based on conversations with F-4 pilots and program managers working
aircraft systems applications. The numbers are
application-specific. The actual numbers for the application at
hand and the configuration of the available processing hardware
determine system ability to respond in real-time. The simulation
and inferencing methods developed in this report are designed to
enable system expandability to ensure real-time performance.

Throughout this report the notation "O(x)" is found. This
means "on the order of x." The notation "O(10^1)" means "on the
order of 10^1" or "somewhere in the neighborhood of 10."

This ES rule-base formulation depends upon a state vector, a
criteria vector, a response (action) vector, and a set of rules.

The aircraft state vector s consists of z continuous and
discrete components (state variables) completely describing the
state of the aircraft at a given time-step t_k, of magnitude O(10^-1)
seconds. These values are determined by a collection of on-board
sensors  and  there  may  be  O(IO^)  of  these.  For  example,  state 
variable s_12 might represent the number of pounds of fuel currently
in the fuel cells. The aircraft criteria vector c is a vector of
m Boolean (True or False) variables. Each of these variables is
based upon a value of one or more of the variables in the state
vector. For example, criterion c_33 might represent the relationship
between the current amount of fuel and a minimum fuel reserve
(e.g., c_33 = (s_12 <= f_min)), and c_33 is True when there is
insufficient fuel reserve.
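
As a minimal sketch of how one Boolean criterion can be derived from a state variable, the fuel example might be coded as follows. The function name `fuel_criterion`, the threshold `MIN_FUEL_RESERVE_LB`, and the fuel quantity are hypothetical values chosen for illustration; the report does not specify them, and the "at or below reserve" comparison is one reading of the example.

```python
# Hypothetical threshold; the report gives no numeric fuel reserve.
MIN_FUEL_RESERVE_LB = 2000.0

def fuel_criterion(state, fuel_index=12, reserve=MIN_FUEL_RESERVE_LB):
    """Return True when the fuel state variable is at or below the reserve."""
    return state[fuel_index] <= reserve

# State variable 12 holds pounds of fuel (illustrative value).
state = {12: 1500.0}

# Criterion 33 is True when the fuel reserve is insufficient.
criteria = {33: fuel_criterion(state)}
print(criteria)
```

Each criterion is a pure function of the current state vector, so all m criteria can be recomputed independently whenever the sensors update.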

A set of n rules, of order O(10^4), defines the on-board expert
system that will advise the pilot and, with the pilot's consent,
act on his or her behalf. Each ES rule can be formulated in terms
of a conjunction of simple Boolean criteria that lead to a single
action. If all of a given rule's criteria are true (based upon the
elements of the corresponding criteria vector), an action will
result. This action could either be an activity that is
automatically performed for the pilot or it could be a
recommendation to the pilot. All of these actions define an action
vector a of size p. Each rule is expected to involve only a
relatively small number of the m possible criteria. For this
report, m is of order O(10^5). For example, each rule may have up
to 10 criteria. The rule-base is built off-line, and not modified
during the search process. For example, a typical rule might look
like this ("~" means "NOT"):

Rule  R123:  actioni2  <==  Ci  &  -Cg  &  Cg  &  Cjg  &  -c^^  &  -099 

This rule is interpreted as stating that action_12 will be taken if
Cl,  Cg,  and  Cig  are  all  true  while  Cg,  C47,  and  C99  are  all  false. 

In a typical ES, the inference engine performs three standard
operations: the match operation matches the criteria against the
rules to see which could fire, the resolve operation chooses which
of these rules will fire, and the execute operation actually fires
these rules and updates working memory. For the given problem,
these operations can be simplified into a simple match-fire
operation with no resolution operation and no updates.

While it is assumed that no action alters the criteria vector
c in any way at any time-step t_k, it is possible that different
rules can have the same action. Hence, by expressing each rule
only in terms of simple AND and NOT logic, its evaluation can be
done very efficiently and independently. (Note that OR constructs
are equivalent to multiple rules that specify the same action.)

Duplicate actions are prevented by the action triggering
mechanism that is external to the inference engine described in
this report. This mechanism sets a "triggered" flag when the
action is started. In a given update cycle, this flag can only be
set once. All other attempts to set this flag are ignored. When
the action is completed, the flag is reset.
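
The flag behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the class `ActionTrigger` and its method names are hypothetical, since the report treats the triggering mechanism as external and does not give its interface.

```python
class ActionTrigger:
    """Hypothetical stand-in for the external action triggering mechanism."""

    def __init__(self, num_actions):
        # One "triggered" flag per action; index 0 is unused (1-based actions).
        self.triggered = [False] * (num_actions + 1)

    def request(self, action):
        """Start the action unless it is already under way.

        Returns True if the flag was set by this call, False if the
        request was ignored because the flag was already set.
        """
        if self.triggered[action]:
            return False
        self.triggered[action] = True
        return True

    def complete(self, action):
        """Reset the flag once the action has finished."""
        self.triggered[action] = False


trigger = ActionTrigger(num_actions=3)
# Two rules request action 1 in the same cycle; only the first takes effect.
started = [trigger.request(a) for a in (1, 1, 2)]
print(started)
```

Because a repeated request is simply ignored, rules that share a consequent can be evaluated independently without coordinating with each other.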

Conflicting actions can be resolved by expanded criteria such
as "~a_51." This means that the rule's consequent action would not
take place if action #51 is underway. This technique would mean
that the blackboard would have to be updated for each action start
and completion. Thus, the action vector would have an associated
"action triggered" vector. This could be accomplished by simply
making the element in the action vector negative for "action
underway" or positive for "action not underway."

3. DATA STRUCTURE AND ALGORITHM DEVELOPMENT

The data structure and algorithms developed to evaluate this
ES are designed to be used by a single fast processor or by
parallel processors that can have a correspondingly slower
clock-speed. This data structure utilizes the notion of a blackboard
that contains the state and criteria vectors described above. In
addition, three other vectors, the action, query, and index vectors,
completely define the rule-base. Unlike the s and c vectors, these
three vectors are not updated, and can reside either in the
blackboard or some other data storage area. The query vector
contains a list of the criteria for each rule. The index vector
elements point to the criteria that apply to each rule.

A blackboard is a global and dynamic data base for the
communication of independent asynchronous knowledge sources for
related aspects of a given problem. The aircraft system blackboard
will contain the state vector s and criteria vector c. These
vectors will be updated by an independent on-board computer (not
involved in the rule search) at each time-step. Each update of the
vector c will immediately initiate a new evaluation of the rules'
criteria, so the rule search must be complete for the criteria
vector at time-step t_k before the criteria vector is updated at
time-step t_(k+1). Hence, this blackboard must also be accessible by
the computer that executes the rule processing algorithm. Criteria
vector updates are discussed by Raeth in [3, 4].

While it is possible that a fast single-processor computer
could be used on the aircraft, the most likely hardware
configuration for the rule processing algorithm will involve eight
(8) parallel processors. This number is convenient since systems
developers typically place eight processors on one board for
embedded applications. (Transputers are a likely candidate since
they are currently available to the sponsor.)

One  of  the  eight  processors  will  serve  as  the  combined  master 
and  I/O  processor  and  will  have  one  of  its  four  serial  I/O  ports 
connected  to  the  common  data  bus  on  the  aircraft.  This  processor 
will  accept  the  criteria  vector  and  possible  pilot  input  and 
provide  the  ultimate  rule  search  output.  The  remaining  7 
processors  may  use  each  of  their  four  I/O  ports  to  connect  to  any 
other processor. A preset architecture will be employed. This
architecture can be as simple as a ring or as complex as a mesh.

Both algorithms presented in this report should be considered
prototypes and have been implemented in Pascal. If a particular
algorithm is sufficiently successful, it will eventually be
implemented in Ada or Ada-9X. This will permit a ready transition
to operational aircraft since the Ada standard has been mandated
for DoD embedded applications. Mil-Std versions of transputers
exist. Together, transputers programmed in Ada represent a mature
and installable parallel processing capability that takes advantage
of modern processor architectures.

The simplest method for this ES evaluation assumes that the
if-then-action rules and their criteria are listed in priority order.
This is equivalent to a priority-oriented backward chaining method.
This is the obvious choice when n << m and no other assumptions are
made about available data. (Note that if these rules were not
prioritized, then this first algorithm could be viewed as a forward
chaining algorithm.) Because no OR-logic is present in a given
rule, the current rule-processor should stop with the first
c_i = False (or first c_i = True in the case of ~c_i). If these rules
are ranked and evaluated from highest to lowest priority, then the
first action produced (if any) would be the most important from the
pilot's point of view. If required, different levels of
parallelism could be employed during this evaluation process. If
the processing is not fast enough, then rules having the same
priority could be grouped according to their number of criteria in
order to equalize the work among the parallel processors, as
discussed by Tout and Evans in [5]. A simple example of a
rule-base with four rules is:

Rule R_1: action_1 <== c_1 & c_3 & ~c_4 & c_40 & ~c_98 & c_99

Rule R_2: action_1 <== c_2 & c_4 & c_22 & ~c_85

Rule R_3: action_2 <== c_5 & c_99

Rule R_4: action_3 <== c_1 & c_50

Note that R_1 is the highest priority rule and R_4 is the lowest
priority rule. The criteria are evaluated left to right.
Evaluation stops as soon as a False is detected. The left-to-right
evaluation can be thought of as assuming that the left-most
criteria are expected to occur most often and are thus evaluated
first.


These rules could be represented efficiently by using three
vectors: the previously discussed action vector a, whose elements
each point to a specific task to be completed; a query vector q,
identifying which criteria have to be checked; and an index vector
End, which delimits the criteria that appear in each of the rules.
For the above rule-base, consider:


Rule 1: ACTION_1 = A_1; Q_1 = 1, Q_2 = 3, Q_3 = -4, Q_4 = 40, Q_5 = -98, Q_6 = 99, so
        START_1 = 1; END_1 = 6

Rule 2: ACTION_2 = A_1; Q_7 = 2, Q_8 = 4, Q_9 = 22, Q_10 = -85, so
        START_2 = 7; END_2 = 10

Rule 3: ACTION_3 = A_2; Q_11 = 5, Q_12 = 99, so
        START_3 = 11; END_3 = 12

Rule 4: ACTION_4 = A_3; Q_13 = 1, Q_14 = 50, so
        START_4 = 13; END_4 = 14

Here q employs positive integers to indicate the indices of
the criteria used in the rules and negative integers for the
indices of the criteria complements (NOT-criteria). Note also that
Rules 1 and 2 have the same consequent. From the previous example,
one has the 14-element query vector:


q: | 1 | 3 | -4 | 40 | -98 | 99 | 2 | 4 | 22 | -85 | 5 | 99 | 1 | 50 |

This allows for direct and very fast access to the c vector
stored on the blackboard (only one internal integer multiplication
and addition are needed to compute any cell address). If parallel
processors are used, this Boolean criteria vector c can be accessed
from the blackboard by all processors. If multicomputers are used,
c would be communicated to the local memory of each processor and
this communication time will need to be considered, according to
Lester in [6]. Each processor also must use components from the
query vector q. Note the relationship Start_(i+1) = End_i + 1 with
Start_1 = 1, so only the End unsigned integer index vector is
actually needed by the algorithm. In this example, one has:


End: | 6 | 10 | 12 | 14 | which implies Start: | 1 | 7 | 11 | 13 |

Note that vector q has a number of elements equal to the sum
of the number of criteria queried by each rule. Vector End has n
elements, the total number of rules.

This method yields Algorithm-1, presented below, which is a
relatively simple and straightforward algorithm that can utilize
these data structures.


Forall i := 1 to n do in parallel
begin
    if i = 1
        then j := 1
        else j := End_(i-1) + 1;
    Fired := TRUE;
    while j <= End_i and Fired do
    begin
        k := q_j;
        if k > 0 and not c_k then
            Fired := FALSE
        else if k < 0 and c_(-k) then
            Fired := FALSE;
        j := j + 1
    end;
    if Fired then perform action a_i
end
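
For readers who want to experiment, the following is a minimal sequential sketch of Algorithm-1 in Python, applied to the four-rule example. It is an illustration rather than the report's Pascal implementation: the helper names `pack_rules` and `evaluate_rules` and the truth values placed in `c` are assumptions, and an ordinary loop stands in for the parallel Forall.

```python
def pack_rules(rules):
    """Flatten (action, signed criteria) pairs into action, q, and End vectors.

    Index 0 of each list is a placeholder so that indices are 1-based,
    matching the notation of the report. End[0] = 0 removes the need for
    the special case i = 1.
    """
    action, q, end = [None], [None], [0]
    for act, signed_criteria in rules:
        action.append(act)
        q.extend(signed_criteria)
        end.append(len(q) - 1)
    return action, q, end


def evaluate_rules(action, q, end, c):
    """Sequential form of Algorithm-1: return actions of all rules that fire."""
    fired_actions = []
    for i in range(1, len(end)):
        j = end[i - 1] + 1                 # Start_i = End_(i-1) + 1
        fired = True
        while j <= end[i] and fired:
            k = q[j]
            if k > 0 and not c[k]:         # required criterion is False
                fired = False
            elif k < 0 and c[-k]:          # negated criterion is True
                fired = False
            j += 1
        if fired:
            fired_actions.append(action[i])
    return fired_actions


rules = [
    ("A1", [1, 3, -4, 40, -98, 99]),
    ("A1", [2, 4, 22, -85]),
    ("A2", [5, 99]),
    ("A3", [1, 50]),
]
action, q, end = pack_rules(rules)

# Illustrative criteria values: only c_1, c_3, c_5, c_40, and c_99 are True.
c = [False] * 100
for true_index in (1, 3, 5, 40, 99):
    c[true_index] = True

fired = evaluate_rules(action, q, end, c)
print(end[1:], fired)
```

With these assumed criteria values, Rules 1 and 3 fire and Rules 2 and 4 stop at their first failed criterion.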


In Algorithm-1, the Forall statement creates up to n parallel
processes. If p is the number of parallel processors and p >= n,
then this loop completes as soon as the slowest of these processes
has finished execution. Here the total parallel processing time at
a given time-step is the maximum of these times. If p < n, then
the next available processor would evaluate the next unprocessed
rule, hence the total parallel processing time at a given time-step
is then the maximum of all sums of the individual processor times.
Notice that this reduces to a normal sequential processing
algorithm when p = 1.

For example, if p >= 4 and it takes an estimated average of 50
microseconds to check each criterion in the previous 4-rule
situation, then 4 copies of the loop body will be created on 4
different processors, each with its own value of the loop control
variable i. These will execute in parallel with respective times
of 300, 200, 100, and 100 microseconds, at most (as soon as a FALSE
is determined, the process stops for the current rule). This would
then take at most 300 microseconds in parallel versus at most 700
microseconds if done sequentially, giving a speedup of 7/3 or
approximately 2.3. Here the action performance time (e.g.,
displaying an information screen) was not considered, nor was
processor-assignment overhead or communication time. Of course,
any of these three times can have a significant effect on this ES
evaluation process.
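
The timing argument above can be reproduced with a small model. This is a sketch under stated assumptions: every criterion of every rule is checked (the worst case), each check costs the same 50 microseconds, and overhead and communication are ignored. The function name `parallel_time` is hypothetical.

```python
import heapq


def parallel_time(rule_times, p):
    """Total time when each rule goes to the next available of p processors."""
    finish = [0] * p                 # time at which each processor becomes free
    heapq.heapify(finish)
    for t in rule_times:
        earliest = heapq.heappop(finish)
        heapq.heappush(finish, earliest + t)
    return max(finish)               # the slowest processor sets the total


CHECK_TIME_US = 50                   # assumed cost of one criterion check
criteria_per_rule = [6, 4, 2, 2]     # the four example rules
rule_times = [n * CHECK_TIME_US for n in criteria_per_rule]

sequential = parallel_time(rule_times, 1)
parallel = parallel_time(rule_times, 4)
speedup = sequential / parallel
print(sequential, parallel, round(speedup, 2))
```

The same function covers the case p < n: with two processors, for example, the model gives 400 microseconds, because the two short rules queue behind the longer ones.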
5. METHOD-2


The previous method does not take advantage of searching in any
informed way whenever a state variable (and hence a criterion)
changes, because the indexing runs only from rule to criterion.
A second, combined forward-backward chaining method could be used
to check only the rules whose criteria values have changed since
the last evaluation of the rule-base. To do this, one could also
index in the opposite direction, from criterion to rule, checking
only the rules having newly changed (currently "active") criteria.
The forward phase identifies the changed criteria and the rules
that use these criteria. The backward phase is the same as before
with presumably fewer rules to process. For example, using the
same four rules as before, one could have something like:


Criterion c_1: NeedToCheck_1 = False; First_1 = 1; Last_1 = 2; R_1 = 1, R_2 = 4

Criterion c_2: NeedToCheck_2 = True; First_2 = 3; Last_2 = 3; R_3 = 2

Criterion c_3: NeedToCheck_3 = False; First_3 = 4; Last_3 = 4; R_4 = 1

Criterion c_4: NeedToCheck_4 = True; First_4 = 5; Last_4 = 6; R_5 = 1, R_6 = 2

Criterion c_5: NeedToCheck_5 = False; First_5 = 7; Last_5 = 7; R_7 = 3

Criterion c_6: NeedToCheck_6 = True; First_6 = 0; Last_6 = 0 (not in any rule)

Criterion c_99: NeedToCheck_99 = False; First_99 = 13; Last_99 = 14; R_13 = 1, R_14 = 3


Assuming criteria c_2, c_4, and c_6 were the only ones that changed
(their NeedToCheck components would be set to True in the
blackboard), the above would cause Rule_2, Rule_1, and Rule_2 to be
consolidated into the set { Rule_1, Rule_2 } with the components
NeedToCheck_2, NeedToCheck_4, and NeedToCheck_6 being reset back to
False. The efficiency of this method is related to the number of
active (recently changed) criteria at any time-step. The number of
criteria that change at any time-step is highly dependent upon the
application. The fewer the criteria that change, the faster this
method will be, but this method is more complex and requires both
more data and storage than the previous method.

Each change in the state vector s at time-step t_k can cause the
status of the Boolean criteria vector c (and its corresponding
NeedToCheck vector) to change. Each criteria vector change, in
turn, causes a set (or prioritized list) of rule numbers to be
defined. Each rule in the set would contain at least one of the
changed criteria and only the rules in this set need to be checked
to see if all criteria hold. Once these rules have been
identified, the actual criteria checking itself is done in the same
manner as in Algorithm-1. It is possible to go further and only
check the previously unsatisfied criteria in those rules. However,
the additional software complexity, memory utilization, and
execution time would likely exceed any savings compared to simply
using Algorithm-1.
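
The forward phase of Method-2 can be sketched as follows for the four-rule example. This is an illustration under stated assumptions: a Python dictionary stands in for the First, Last, and R vectors, and the names `build_criterion_index` and `rules_to_check` are hypothetical.

```python
def build_criterion_index(rules):
    """Map each criterion number to the rules that use it, in rule order.

    `rules` maps a rule number to its list of signed criterion numbers.
    """
    index = {}
    for rule_no in sorted(rules):
        for k in rules[rule_no]:
            index.setdefault(abs(k), []).append(rule_no)
    return index


def rules_to_check(index, need_to_check):
    """Collect the rules that use a changed criterion, then reset the flags."""
    selected = set()
    for criterion, changed in need_to_check.items():
        if changed:
            selected.update(index.get(criterion, []))
            need_to_check[criterion] = False
    return sorted(selected)


rules = {
    1: [1, 3, -4, 40, -98, 99],
    2: [2, 4, 22, -85],
    3: [5, 99],
    4: [1, 50],
}
index = build_criterion_index(rules)

# Criteria 2, 4, and 6 changed; criterion 6 appears in no rule.
need_to_check = {criterion: False for criterion in index}
need_to_check.update({2: True, 4: True, 6: True})

selected = rules_to_check(index, need_to_check)
print(selected)
```

Only the selected rules are then passed to the same criteria check used in Algorithm-1, which is where the savings arise when few criteria change per time-step.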

Figure 1 summarizes the previous discussion of Method-1 and
Method-2. In order to map the rules to actions, each element in
the Rules -> Action vector contains a pointer to an element in the
Action vector. A record-oriented data structure can also be used
to implement this system.

(Figure by Raeth)

Figure  1:  Basic  Systems  Diagram  of  Parallel  Real-Time  Rule-Based  Expert  System 


In practice, one or more sensor failures may lead to
undetermined (uncertain) components of the state vector s, which
may lead to one or more unknown components of the criteria vector
c. For any given rule, one of three situations must hold at
time-step t_k: (1) all its criteria are known, (2) there are unknown
criteria, but at least one of the known criteria fails to be
satisfied, or (3) all of the known criteria are satisfied, but there
are still unknown criteria. The first two situations are easily
addressed, since it can be exactly determined if the rule will fire
or not (in the second case it will not fire). In the third
situation, the values of the unknown criteria determine whether the
rule will fire or not. Because of the possible interdependence of
criteria, it is very difficult to determine any type of formal
probability or level of certainty measure associated with the
firing of this rule since multivariable conditional probabilities
are involved. However, it is possible to report a possible action
by simply keeping count of the number of criteria that are unknown
for the given rule. This requires that each component of the
criteria vector c have one of three values (True, False, Unknown),
instead of just True or False as used in Algorithm-1. A possible
action occurs if every one of a rule's criteria is either satisfied
or Unknown, with at least one Unknown.


The algorithm to do this is a variation of Algorithm-1, but is
slightly more complex and takes more processing time. This is
because an additional IF-test is needed, and two additional
counting operations are necessary for the reporting when one or
more of the necessary c values are unknown. The reporting of the
Ucount/Ncrit ratio is intended to give the pilot some measure of
exactly how many unknown criteria (Ucount) exist relative to the
total number of criteria (Ncrit) that are used in the given rule.
For example, if there are 10 criteria in the rule and a possible
action is reported with a ratio of 1/10, then the pilot might place
more confidence in it than if a ratio of 7/10 was presented.


The algorithm designed to deal with this uncertainty is
presented below as Algorithm-1u:


Forall i := 1 to n do in parallel
begin
    if i = 1
        then j := 1
        else j := End_(i-1) + 1;
    Fired := TRUE;
    Ncrit := 0;
    Ucount := 0;
    while j <= End_i and Fired do
    begin
        k := q_j;
        if c_|k| is Unknown then
            Ucount := Ucount + 1
        else if k > 0 and c_k is False then
            Fired := FALSE
        else if k < 0 and c_(-k) is True then
            Fired := FALSE;
        j := j + 1;
        Ncrit := Ncrit + 1
    end;
    if Fired then
        if Ucount = 0
            then perform action a_i
            else report possible a_i with Ucount/Ncrit ratio
end

This algorithm could also be modified to report exactly which
unknown criteria caused the problem. When considered in the total
application context, it may also be useful to report the failed
sensors that caused the unknown criteria.
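
A minimal Python sketch of the per-rule logic of Algorithm-1u follows. It is an illustration under stated assumptions: `None` stands in for the Unknown value, the function name `evaluate_rule_1u` and the status strings are hypothetical, and the criteria values are chosen only to exercise the three outcomes.

```python
UNKNOWN = None  # assumed stand-in for the third criterion value


def evaluate_rule_1u(signed_criteria, c):
    """Evaluate one rule against a three-valued criteria vector.

    Returns (status, ucount, ncrit), where status is "fire" when all
    criteria are satisfied, "possible" when none fails but some are
    Unknown, and "none" when a known criterion fails.
    """
    ucount = 0
    ncrit = 0
    for k in signed_criteria:
        value = c[abs(k)]
        ncrit += 1
        if value is UNKNOWN:
            ucount += 1
        elif (k > 0 and not value) or (k < 0 and value):
            return ("none", ucount, ncrit)   # stop at the first known failure
    return ("fire" if ucount == 0 else "possible", ucount, ncrit)


rule_1 = [1, 3, -4, 40, -98, 99]
c = {1: True, 3: True, 4: False, 40: UNKNOWN, 98: False, 99: True}

status, ucount, ncrit = evaluate_rule_1u(rule_1, c)
print(status, f"{ucount}/{ncrit}")
```

Here the single unknown criterion yields a possible action with a ratio of 1/6. Collecting the indices of the Unknown criteria inside the loop would give the more detailed report mentioned above.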




7.  SIMULATION  GUIDELINES  AND  RESULTS 

The software and hardware realization associated with the rule
processing algorithm will depend upon the amount and frequency of
the available data and the real-time constraints for the solution.
To see if an algorithm is acceptable, it could be implemented
within a specially written Turbo Pascal simulation program such as
PASIM (Pilot's Associate Simulator). This simulator can be used to
test Algorithm-1 and estimate both the sequential and parallel
processing speeds. Because of the interest in handling
uncertainty, the Pilot's Associate Reliability Simulator (PARSIM)
program was developed to test Algorithm-1u. PARSIM can be thought
of as an extension of PASIM that also allows the user to
incorporate an uncertainty percentage that will simulate
sensor failures throughout the flight. In this section, sample
simulation results are given for both of these algorithms.
However, most of the emphasis is placed upon Algorithm-1u as
implemented by the PARSIM program. Both PASIM and PARSIM are
designed to use a single-processor computer to simulate a parallel
processing system.

The current PARSIM program parameters include (1) a maximum of
10,000 rules that are assumed to be in priority order, (2) a
maximum of 10,000 different actions (during a given time-step,
actions can be listed for all the rules that are fired), (3) a
maximum of 32,760 different criteria altogether (this
is the largest single block of data allowed in Turbo Pascal and
32,767 is the largest positive integer), and (4) a maximum of 8
transputer processors (one of these used strictly for
I/O). The use of dynamic (array) variables in the program was
ultimately necessary to allow the sizes achieved above. Note that
a given criterion can appear in more than one rule and that a
consequent action can be triggered by more than one rule.

In  order  to  perform  an  effective  simulation,  one  needs  to  know 
the  processor  speed  in  (1)  evaluating  a  single  Boolean  criterion 
Ci,  and  (2)  performing  any  recommended  action  produced  by  a  rule. 
The  Inmos  transputers  to  be  simulated  are  T800-20  32-bit  models 
with  math  coprocessors.  These  transputers  have  a  clock-speed  of  20 
MHz  and  up  to  4  megabytes  of  memory  each.  A  transputer  was  not 
available  in  this  study.  However,  the  more  critical  criterion 
evaluation  speed  can  be  estimated  or  bounded  by  empirically  timing 
this evaluation on the processors below (all including math
coprocessors). Here are the approximate averages of the measured
processing speeds required to evaluate a single criterion based
upon Algorithm-1:

Intel 80386/16MHz: 81 microseconds = 8.1x10^-5 seconds

Intel 80486/33MHz: 16 microseconds = 1.6x10^-5 seconds

Intel 80486/50MHz: 12 microseconds = 1.2x10^-5 seconds

The 386/16 processor is the slowest of these and presumably the
closest in processing speed to the T800-20 transputer. For
Algorithm-1u with the 386/16 processor, the approximate average
criterion evaluation time was found to be 115 microseconds (i.e.,
it takes almost 42% longer to evaluate a single criterion).

PARSIM randomly generates up to a given number of criteria for
each of n rules. It also generates random Boolean values for m
criteria, and updates them at random for each time-step. Many
random numbers are required during a typical simulation. The
built-in Turbo Pascal pseudo-random number (PRN) generator, called
RANDOM, did not produce a sufficient number of PRNs, so a uniform
(0,1) real full-period PRN generator (implemented with 32-bit
integers) is employed. This technique is discussed by Park and
Miller in [7] and by Press et al. in [8]. This is done with the
seed update seed_(t+1) := 16807 * seed_t mod 2147483647, producing
the uniform PRN by using u := seed_t / 2147483647. As usual, at the
start of a typical simulation, the "randomized" initial seed is
obtained from the system clock.
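
The seed update above can be sketched in a few lines. This is an illustration, not the PARSIM code: the names `next_seed` and `uniform` are hypothetical, and because Python integers do not overflow, the sketch does not need the overflow-avoiding arithmetic that a 32-bit integer implementation would require.

```python
MODULUS = 2147483647      # 2**31 - 1
MULTIPLIER = 16807        # 7**5


def next_seed(seed):
    """Advance the seed: seed_(t+1) = 16807 * seed_t mod 2147483647."""
    return (MULTIPLIER * seed) % MODULUS


def uniform(seed):
    """Map a seed to a uniform value strictly between 0 and 1."""
    return seed / MODULUS


# Check value given by Park and Miller: starting from seed 1,
# the seed after 10,000 updates is 1043618065.
seed = 1
for _ in range(10000):
    seed = next_seed(seed)
u = uniform(seed)
print(seed, u)
```

In a simulation run the starting seed would come from the system clock instead of the fixed value 1 used here for the check.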

A nominal mission length for a fighter aircraft (such as an
F-16) might range from 1-2 hours up to as many as 5 hours of flying
time. If one assumes that sensor updates all occur at 0.1-second
intervals, this dictates the simulation time-step. For example, a
90-minute flight would take 54,000 time-steps (90x60x10), and if
there were 10,000 rules with up to 10 criteria per rule (an average
of 5 criteria per rule), it would take about 18 hours to run the
PASIM code on a 486/33 machine. On that same computer, the PARSIM
code takes longer, for the reasons already indicated, hence a much
smaller number of time-steps was used.

In the simulation, all rules in the rule-base are evaluated for
each time-step in the flight; this is to produce the "worst-case"
situation so that any necessary processor speed-up can be
identified. Actions are triggered as soon as a rule fires. It is
assumed that another set of processors performs the actions.
Obviously the search could be significantly faster if the algorithm
could terminate after the first action is triggered.

Figure  2  indicates  essentially  what  is  shown  on  the  screen  when 
PARSIM  executes  (the  screen  size  has  more  columns  than  this  page  of 
text  so  the  wording  has  been  slightly  modified) .  The  underlined 
quantities  represent  the  simulation  input  and  output  values.  As 
with  any  simulation,  these  values  contain  a  measure  of  uncertainty. 

Specifically,  Figure  2  shows  the  input  and  output  of  a  short 
PARSIM  simulation  with  4,000  rules  with  up  to  10  criteria  per  rule 
generated.  The  number  of  unique  actions  chosen  does  not  affect  the 
simulation  and  was  arbitrarily  chosen  to  be  4,000  also.  The  input 
of  1.15e-4  indicates  the  estimated  average  time,  in  seconds,  that 
it  will  take  the  on-board  rule-processor  to  process  a  single 
criterion (typical of a 386/16). The action time is input as zero,
since it is assumed that the ES triggers another computer to
perform this action. No intermediate output is requested (it is
only feasible to do this when the number of rules is very small).
Here 12,000 is the number of time-steps. Depending upon the sensor
update time, this could correspond to different flight times. For
example, if 0.1 second is the update time, then 0.1 x 12000 seconds
corresponds to 20 minutes of flight time, but if 0.5 second is the
sensor update time, the flight is 1 hour and 40 minutes. The
number 16027 is the total number of unique criteria generated.
When the user enters 10, it indicates that 10% of these are
expected to fail before the end of the flight.
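
The conversion from time-steps to flight time can be checked with a short
calculation. The following is only an illustrative Python sketch; the function
name flight_minutes is hypothetical and is not part of PARSIM.

```python
def flight_minutes(time_steps: int, update_seconds: float) -> float:
    """Minutes of simulated flight covered by `time_steps` sensor updates."""
    return time_steps * update_seconds / 60.0


if __name__ == "__main__":
    # 12,000 time-steps, with the two sensor update times used in the report.
    for update in (0.1, 0.5):
        minutes = flight_minutes(12000, update)
        print(f"update time {update} s: {minutes:.0f} minutes of flight")
```

With 12,000 time-steps this gives 20 minutes for a 0.1-second update time and
100 minutes (1 hour and 40 minutes) for a 0.5-second update time.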


Rule Simulation Program

This simulates the real-time processing of a set of expert system rules.
Logical contradictions within the generated rules are not guaranteed, nor are
the absence of duplicate rules. Neither should significantly effect the
simulation. It is guaranteed that there will be no duplicate criteria in a
rule.

INPUT:

Enter the number of rules to generate (1, .., 10000): 4000
Enter the number of different actions (1, .., 10000): 4000
Enter the criteria limit for each rule out of 32760 possible (1, .., 28761): 10
Enter the simulated time needed to evaluate a single criterion: 1.15e-4
Enter the simulated time needed to perform a single action: 0
Enter the total number of parallel processors (2, .., 8): 8
Enter the amount of intermediate output desired (0 is nominal)
- None(0), First Action(1), Rules & Actions(2), Rules, Actions & Criteria(3): 0
Enter the (non-negative) number of simulation time-steps: 12000
Enter the uncertainty percent for the 16027 criteria generated [0.0,100.0]: 10

OUTPUT:

The actual lapsed system clock a-time was 1.499850E+0003 seconds, with 21966
criteria processed and 1723 unique rule(s) out of 4725280 fired.
On average there were [illegible] criteria per rule with 3.125E-0005 seconds
needed to process each rule and 1.250E-0001 seconds for the entire rule-base.
The simulated sequential s-time (one CPU) was 1.034389797E+0004 time units.
The simulated parallel p-time (8 processors) was 1.529650535E+0003 time units.
Max(a-time)= 1.7000E-0001, Max(s-time)= 9.0988E-0001, Max(p-time)= 1.3720E-0001

Figure  2:  Rule  Simulation  Program  Sample  Screen  Output 


In the output, the a-time is the total time for the computer
executing PARSIM to process all the criteria in all the rules for
the entire flight. (The example in Figure 2 was run on a 486/33
and took 1,499.85 seconds to do this.) Its main purpose is to help
determine  the  overall  average  single  criterion  evaluation  time  for 
the  particular  computer  being  used.  If  the  user  enters  a  1  for  the 
single  criterion  evaluation  time,  then  the  s-time  (simulated 
sequential  time)  is  simply  a  count  of  the  number  of  criteria 
processed.  By  dividing  this  count  into  the  a-time,  one  obtains  the 
average  time  to  process  a  single  criterion  for  the  processor  on  the 
computer  currently  being  used.  By  making  several  runs  of  this  type 
(e.g.,  15  runs),  one  can  obtain  a  reasonable  estimate  of  this 
overall  average. 
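
The calibration procedure just described can be sketched in Python as follows.
This is a hedged illustration only: the function names are hypothetical, the
overall average is assumed here to be the mean of the per-run averages, and
the three (a-time, count) pairs are made-up placeholders, not measurements
from this study.

```python
from statistics import mean


def criterion_time(a_time_seconds: float, criteria_count: int) -> float:
    """Average time per criterion for one run made with a criterion time of 1.

    In such a run the s-time is simply the number of criteria processed,
    so dividing it into the a-time gives seconds per criterion.
    """
    return a_time_seconds / criteria_count


def overall_average(runs) -> float:
    """Mean of the per-run averages over several calibration runs."""
    return mean(criterion_time(a, n) for a, n in runs)


# Placeholder values for illustration only (NOT data from the report).
example_runs = [(812.4, 10_000_000), (809.1, 10_000_000), (808.5, 10_000_000)]

if __name__ == "__main__":
    print(f"estimated time per criterion: {overall_average(example_runs):.3e} s")
```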

The key output values are the last two given in Figure 2. They
represent  the  maximum  simulated  sequential  and  parallel  times  over 
the  entire  flight  that  it  will  take  to  evaluate  the  rule-base. 
From  these  two  values,  it  can  be  determined  if  the  entire  rule-base 
can  be  processed  in  less  time  than  the  sensor  update  time.  For 
example,  if  the  sensor  updates  are  done  every  0.1  second,  then  the 
rule processing time at any time-step must not be slower than this.
Here  a  single  processor  takes  slightly  over  0.9  second  to  process 
the  entire  rule-base  with  the  sequential  form  of  Algorithm-1  while 
the  8-processor  parallel  version  of  the  same  algorithm  takes 
slightly  over  0.1  second.  For  this  single  simulation,  neither  the 
sequential  nor  the  8  parallel  processors  (only  7  actually 
processing  the  rules)  will  process  the  rule-base  in  an  acceptable 
amount of time. However, if the sensor update time is 0.5 second,
then the parallel processing is fast enough since 0.1372 < 0.5, but
the sequential processing is still not fast enough. If this
algorithm  is  implemented  on  multicomputers  that  have  no  common 
memory,  then  updated  data,  such  as  the  NeedToCheck  vector,  will 
have  to  be  communicated  to  each  local  rule-processor  for  each 
sensor  update  cycle.  This  takes  an  additional  amount  of  time.  For 
example, if this time were 0.2 second, the parallel processing is
still  fast  enough  since  0.3372  is  still  less  than  0.5. 
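
The comparisons in this paragraph amount to a simple deadline test, which can
be sketched in Python as follows. The function name meets_deadline is
hypothetical; the two maxima are the Max(s-time) and Max(p-time) values from
Figure 2, and the 0.2-second communication time is the example value used
above.

```python
MAX_S_TIME = 0.90988  # Max(s-time) from Figure 2, in seconds
MAX_P_TIME = 0.13720  # Max(p-time) from Figure 2, in seconds


def meets_deadline(max_eval_time: float, update_time: float,
                   comm_time: float = 0.0) -> bool:
    """True if the worst-case rule-base evaluation, plus any communication
    time, finishes in less than one sensor update period."""
    return max_eval_time + comm_time < update_time


if __name__ == "__main__":
    for update in (0.1, 0.5):
        print(f"update {update} s: "
              f"sequential ok = {meets_deadline(MAX_S_TIME, update)}, "
              f"parallel ok = {meets_deadline(MAX_P_TIME, update)}")
    # Parallel case with 0.2 second of communication time per update cycle.
    print("parallel, 0.5 s update, 0.2 s communication:",
          meets_deadline(MAX_P_TIME, 0.5, comm_time=0.2))
```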

Of  course,  conclusions  such  as  the  above  should  not  be  based 
upon  just  one  simulation  run.  One  should  run  several  simulations, 
at  least  10,  with  the  same  number  of  rules  for  long  time  periods 
(e.g.,  equivalent  to  5  hours  of  flight  time)  to  draw  conclusions 
with  a  sufficient  amount  of  confidence.  Since  it  will  take 
approximately  1  hour  and  24  minutes  to  perform  this  simulation  on 
a 486/33 microcomputer, a lot more running time is needed.

If  uncertainty  is  not  a  concern,  then  Algorithm-1  can  be  used 
as implemented by PASIM. Using the same inputs as above, except
that  8.1e-5  is  used  in  place  of  1.15e-4  (no  uncertainty  percent  is 
needed),  one  finds  that  the  maximum  sequential  time  is  almost  0.6 
second  while  the  maximum  parallel  time  is  under  0.09  second. 
Hence,  based  upon  just  one  run,  one  would  conclude  that  parallel 
processing  of  4,000  rules  is  fast  enough  even  when  0.1-second 
sensor  updates  are  used.  This  does  not  take  any  additional 
communication  time  into  account. 




There are some PARSIM (and PASIM) system limitations that
should be mentioned: (1) Due to the problem requirements as well
as the clock precision on the simulation computer (1/100th of a
second), one should not expect reliable timing estimates with fewer
than 1,000 rules. (2) The upper limit is 10,000 rules, but if one
simulates a rule-base of around 10,000 rules and chooses a maximum
of, for example, 10 criteria per rule, then on average 50,000
criteria would have to be kept in the g array. But, this array is
limited to 32,760 locations, so the code will automatically reduce
the number of criteria per rule near the end of the generated
rule-base to ensure that each rule has at least one criterion. Hence,
the  actual  average  number  of  criteria  per  rule  may  be  closer  to  3 
than  to  5  because  of  all  the  rules  that  must  have  only  one 
criterion.  It  is  up  to  the  PARSIM  user  to  determine  if  this 
average  is  realistic.  (3)  PARSIM  run  times  do  not  take  into 
account  any  processor  assignment  or  data  communication  times. 
These  depend  upon  both  the  architecture  and  hardware  being  used. 
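
The storage limitation in item (2) can be illustrated with a short Python
sketch. It is not taken from the PARSIM source: the function names are
hypothetical, the average number of criteria per rule is taken as half the
limit (the estimate used in the text), and the formula in largest_limit is
only an inference that happens to reproduce the bound 28761 shown in the
Figure 2 prompt for 4,000 rules.

```python
ARRAY_CAPACITY = 32760  # locations available for storing criteria


def requested_criteria(num_rules: int, criteria_limit: int) -> int:
    """Expected number of stored criteria, at about half the limit per rule."""
    return num_rules * criteria_limit // 2


def average_upper_bound(num_rules: int, criteria_limit: int) -> float:
    """Largest average criteria per rule that fits in the array."""
    stored = min(requested_criteria(num_rules, criteria_limit), ARRAY_CAPACITY)
    return stored / num_rules


def largest_limit(num_rules: int) -> int:
    """Inferred largest per-rule limit that still leaves every other rule
    at least one location."""
    return ARRAY_CAPACITY - (num_rules - 1)


if __name__ == "__main__":
    print(requested_criteria(10000, 10))   # exceeds the array capacity
    print(average_upper_bound(10000, 10))  # capped average
    print(largest_limit(4000))
```

For 10,000 rules with a limit of 10, the requested 50,000 criteria exceed the
32,760 locations, so the average cannot be more than about 3.3 criteria per
rule, in line with the statement that it may be closer to 3 than to 5.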

Most  of  the  simulations  that  were  run  during  this  investigation 
used  rule-bases  of  sizes  from  1,000  up  to  6,000.  The  number  of 
distinct  actions  was  arbitrarily  input  to  be  the  same  as  the  number 
of  rules  since  this  has  no  effect  on  the  simulation  timing  at  all. 
(However,  if  one  wishes  exact  simulation  reproducibility,  this  can 
be  input  as  a  negative  number,  and  the  absolute  value  of  this 
number is used as the initial PRN seed instead of using the system
clock.) Two different rule limits were investigated, up to 10
criteria per rule and up to 20 criteria per rule. For example, if
a  rule  is  needed  to  identify  a  radar  according  to  six  pairs  of 
parameter  ranges,  then  up  to  12  criteria  may  be  needed  in  this  rule 
(e.g., $710 \le f_1 \le 855$ yields $C_1 = f_1 \ge 710$ and $C_2 = f_1 \le 855$).
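
How a parameter range turns into a pair of criteria can be sketched in Python
as follows. This is an illustration under stated assumptions, not the
report's data structure: the tuple representation and the function names are
hypothetical, and only the range for f1 (710 to 855) comes from the text. The
bounds used for f2 through f6 are placeholders.

```python
from typing import Dict, List, Tuple

Criterion = Tuple[str, str, float]  # (parameter name, ">=" or "<=", bound)


def range_to_criteria(parameter: str, low: float, high: float) -> List[Criterion]:
    """Split low <= parameter <= high into one lower and one upper criterion."""
    return [(parameter, ">=", low), (parameter, "<=", high)]


def holds(criterion: Criterion, readings: Dict[str, float]) -> bool:
    """Evaluate a single criterion against the current sensor readings."""
    name, op, bound = criterion
    value = readings[name]
    return value >= bound if op == ">=" else value <= bound


# The f1 range is from the text; the remaining five ranges are placeholders.
ranges = [("f1", 710, 855)] + [(f"f{i}", 0, 1) for i in range(2, 7)]
rule_criteria = [c for name, lo, hi in ranges
                 for c in range_to_criteria(name, lo, hi)]

if __name__ == "__main__":
    print(len(rule_criteria), "criteria from", len(ranges), "ranges")
```

Six ranges therefore produce 12 criteria, which is why a criteria limit above
10 per rule was also investigated.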

Two  different  time-steps  were  also  studied:  0.1  and  0.5  second. 
These  were  the  thresholds  used  to  determine  when  the  simulated 
maximum  sequential  or  parallel  times  were  good  enough,  meaning 
smaller  than  0.1  or  0.5  second,  respectively.  The  criterion 
evaluation  times  depended  upon  the  algorithm  being  used.  As  stated 
earlier,  on  average,  it  was  found  that  Algorithm-1  used  81 
microseconds and Algorithm-1u used 115 microseconds to evaluate a
single  criterion  for  the  16-MHz  Intel  386  processor.  That  is  the 
available  processor  closest  in  clock  speed  to  the  20-MHz 
transputer. 

8. TEST RESULTS AND CONCLUSIONS


Due  to  the  short  project  time  available  and  the  large  amounts 
of  running  time  required,  only  one  or  two  simulations  were  made  for 
each  rule  number  and  criteria  limit  combination.  All  of  the 
conclusions  below  are  based  upon  these  limited  cases  and  should  be 
viewed  accordingly.  In  particular,  for  maximum  rule-base 
evaluation  times  close  to  any  desired  threshold  (e.g.,  0.1  or  0.5), 
more  simulations  will  be  necessary. 

Based  upon  the  simulations  using  Algorithm-1  with  the  PASIM 
program  (no  uncertainty  addressed) ,  eight  parallel  processors  with 
a  clock  speed  of  16  MHz  or  faster  were  always  able  to  process  a 
rule-base  of  4,500  or  fewer  rules  having  a  maximum  of  10  criteria 
per  rule.  Single  processors  at  this  same  speed  were  unable  to  do 
this  in  under  a  tenth  of  a  second.  The  average  maximum  speed-up 
was  6.6,  using  1  master  and  7  rule-processors.  A  faster  single 
processor,  such  as  the  50-MHz  Intel  486,  would  be  able  to  process 
the  same  4,500  rules  within  this  same  time. 
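
As a point of comparison, the maximum speed-up for a single run can be
computed from the two maxima reported in Figure 2. That run used PARSIM (with
uncertainty) rather than PASIM, so the Python sketch below only illustrates
the calculation and is not a reproduction of the 6.6 average quoted above.
The function names are hypothetical.

```python
MAX_S_TIME = 0.90988   # Max(s-time) from Figure 2, in seconds
MAX_P_TIME = 0.13720   # Max(p-time) from Figure 2, in seconds
RULE_PROCESSORS = 7    # 8 processors: 1 master plus 7 rule-processors


def speed_up(sequential_time: float, parallel_time: float) -> float:
    """Ratio of sequential to parallel evaluation time."""
    return sequential_time / parallel_time


def efficiency(sequential_time: float, parallel_time: float, workers: int) -> float:
    """Speed-up per processor that actually evaluates rules."""
    return speed_up(sequential_time, parallel_time) / workers


if __name__ == "__main__":
    print(f"speed-up:   {speed_up(MAX_S_TIME, MAX_P_TIME):.2f}")
    print(f"efficiency: {efficiency(MAX_S_TIME, MAX_P_TIME, RULE_PROCESSORS):.2f}")
```

For the Figure 2 run the maximum speed-up is about 6.6 on 7 rule-processors,
that is, roughly 95% of the ideal speed-up of 7.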

Once  uncertainty  is  introduced  into  the  criteria  vector,  the 
processing  time  increases.  This  was  investigated  by  using 
Algorithm-1u in the PARSIM code. Here the estimated average time
to  process  a  single  criterion  goes  from  81  microseconds  to  115 
microseconds.  In  order  to  process  all  of  the  rules  within  a  tenth 
of  a  second,  with  up  to  10  criteria  per  rule,  the  simulation  showed 
that the number of rules in the rule-base had to be reduced to
2,500. It appears that up to 2,000 rules, each having up to 20
criteria, can be executed in parallel in under a tenth of a second
(with no extra communication time taken into account). Without
parallel processing, not even 1,000 rules could be evaluated. If
the sensor update time is increased to a half-second, then up to
6,000 rules with up to 20 criteria each can be processed in
parallel (even with a 0.2-second communication time added).